ABSTRACT

Mutations in Pol γ represent a major cause of human
mitochondrial diseases, especially those affecting
the nervous system in adults and in children.
Recessive mutations in Pol γ represent nearly half
of those reported to date, and they are nearly uniformly
distributed along the length of the POLG1
gene (Human DNA Polymerase gamma Mutation
Database); the majority of them are linked to the
most severe form of POLG syndrome, Alpers-Huttenlocher
syndrome. In this report, we assess
the structure-function relationships for recessive
disease mutations by reviewing existing biochemical
data on site-directed mutagenesis of the
human, Drosophila and yeast Pol γs, and their
homologs from the family A DNA polymerase
group. We do so in the context of a molecular
model of Pol γ in complex with primer-template
DNA, which we have developed based upon the
recently solved crystal structure of the apoenzyme
form. We present evidence that recessive mutations
cluster within five distinct functional modules in the
catalytic core of Pol γ. Our results suggest that
cluster prediction can be used as a diagnosis-supporting
tool to evaluate the pathogenic role of
new Pol γ variants.



INTRODUCTION 

Mitochondrial DNA polymerase, Pol γ, is the sole known
DNA polymerase in animal mitochondria and is responsible
for mitochondrial DNA (mtDNA) replication and
repair (1,2). The human enzyme is a heterotrimer consisting
of a catalytic subunit, Pol γA, and a dimer of an accessory
subunit, Pol γB (3). Pol γA is a member of the family A
DNA polymerase group to which bacterial DNA Pol I
belongs (1). It comprises three domains: an N-terminal
domain containing 3'→5' exonuclease (exo) activity, a
spacer domain and a C-terminal domain containing
5'→3' DNA polymerase (pol) activity in three subdomains
termed the palm, fingers and thumb. Pol γA also bears a
5'-deoxyribose phosphate lyase activity (4), but the location
of its active site is unknown. The accessory subunit, Pol γB,
serves as a processivity factor, enhancing the DNA-binding
affinity and catalytic activities of Pol γA (5,6). The crystal
structure of human Pol γ in its apoenzyme form was solved
recently (PDB code 3IKM) (7).

Mutations in Pol γ represent a major cause of human
mitochondrial diseases, especially those affecting the
nervous system in adults and in children. Dominant mutations
typically cause adult-onset myopathies and
encephalopathies (8,9), whereas recessive mutations
result in severe adult or juvenile onset ataxia-epilepsy syndromes
(MIRAS, SCA-E, SANDO) (10-12), or
devastating early-childhood Alpers syndrome (Alpers-Huttenlocher
syndrome) characterized by intractable
epilepsy, psychomotor retardation and liver failure that
leads to early death (13,14). On the molecular level,
POLG syndromes are accompanied by tissue-specific
mtDNA depletion, deletions or a combination
of both (15,16). To date, more than 145 POLG1
disease mutations have been identified (Human DNA
Polymerase gamma Mutation Database,
http://tools.niehs.nih.gov/polg/), of which more than half are
recessive. The functional consequences of dominant mutations,
which affect primarily catalytic residues in the pol domain,
have been explained by effective competition for the DNA
substrate by the mutant Pol γ with the wild-type enzyme
(7,17,18), and by site-specific stalling of mutant DNA
polymerase (18). Studies that explain effects of individual
recessive mutations on protein function are limited (19-22).
The majority of the reported recessive mutations are
linked to the most severe form of POLG syndrome,
Alpers-Huttenlocher syndrome. Manifestation of Alpers
disease typically requires the presence of at least two
recessive, compound heterozygous mutations, and is
accompanied primarily by mtDNA depletion (16,23).
The detailed mechanisms of pathogenesis and mtDNA
depletion are unknown.

The assignment of a newly identified amino acid substi- 
tution as a pathogenic mutation is generally based on the 
absence of the variant in the normal population, conser- 
vation of the site among species and segregation in 
families. These assignments are particularly challenging 
because of the extensive variation in the amino acid sequences
of Pol γA in the normal population. To date, data
accumulated on recessive disease mutations show that 
they are almost uniformly distributed among the three 
structural domains of the catalytic subunit and in most 
cases, it is unclear which property of the enzyme is 
affected, and how it contributes to mtDNA deletion or 
depletion. 

The present study aims to assess the structure-function
relationships for recessive disease mutations, by reviewing
existing biochemical data on site-directed mutagenesis of
the human, Drosophila and yeast Pol γs, and their
homologs from the family A DNA polymerase group;
we do so in the context of the recently solved crystal structure
of human Pol γ (7), onto which we have modeled
primer-template DNA (ptDNA). We present evidence
that recessive mutations cluster within five distinct functional
modules in the catalytic core of Pol γ that we designate
as 'Alpers Clusters 1-5'. We note that our analysis
of the recessive mutations found in compound heterozygous
form in Alpers patients reveals that a severe disease
manifestation is typically caused by a combination of at
least two mutations from different clusters. Our results
suggest that cluster prediction can be used as a
diagnosis-supporting tool to evaluate the pathogenic role
of new Pol γ variants.



APPROACH TO COMPARATIVE STRUCTURAL 
ANALYSIS 

We docked ptDNA into the putative DNA-binding
channel of the apoenzyme form of the human Pol γ
structure (PDB code 3IKM) (7) by superposition of the
closed ternary complex of T7 Pol bound to ptDNA and
dNTP (PDB code 1T8E) (24). The coordinates of the resulting
Pol γ:DNA model are provided in Supplementary
Data. To ensure reliable positioning for subsequent structural
analysis of Alpers recessive mutations, we evaluated
three different alignments of the Pol γ and T7 palm
domains that are presented in Supplementary Figure S1.
First, we aligned the palm domain of Pol γA (PDB code
3IKM, chain A, residues 815-910 and 1095-1239) with
that of T7 Pol (PDB code 1T8E, chain A, residues 409-487
and 611-704). Additionally, we aligned the two structures
by superposition of their 12β-loop-13β motifs
from the palm subdomain, encompassing residues
G1127-E1144 in Pol γA and F646-E663 in the T7 catalytic
core [for secondary structure element assignment see (25)].
This part of the central β-sheet in the palm subdomain is
the most structurally conserved element between DNA
polymerases in the family A DNA polymerase group
(25), and it superposed with an RMSD of 1.906 Å for
Pol γA and T7. The alignment showed that the functionally
significant secondary structure elements (SSEs) of the pol
domain of Pol γA, including helices αO, αL and αQ, as
well as the loop corresponding to the 7β-loop-8β motif in
T7 Pol, are tilted away from the active site, as compared
to the analogous SSEs from T7 Pol and Klenow.
This suggests conformational rearrangements in the Pol γ
catalytic core upon DNA binding. To model the
ptDNA in the putative DNA-binding channel of Pol γA,
and to identify residues that interact with the DNA
duplex, we performed a comparative analysis of the T7
Pol (24) and Klentaq (26) DNA complexes in both the
open and closed states. We sought to identify SSEs in
the pol domain that fulfill the requirements of interaction
with DNA and preservation of conserved positions
between the enzymes. As a result, the Q-helix and the
7β-loop-8β motif were identified, and a third alignment
of Pol γA and T7 Pol was based on superposition of
their αQ helices (residues M1093-E1122 in Pol γ and
P606-E635 in T7 Pol), and the hairpins formed by the
7β-loop-8β (residues T846-V855 in Pol γ and P422-T431
in T7 Pol). The RMSD for the superposed hairpins was
1.397 Å.
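
The three motif ranges quoted above have equal lengths in the two enzymes, so a residue number in one can be translated to the other by a simple per-segment offset. The following Python sketch illustrates that bookkeeping. It assumes that each quoted segment pairs residue-by-residue without gaps, which the text does not state explicitly. The segment labels and the function name `polg_to_t7` are illustrative only and are not part of the published model.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    """A stretch quoted in the text as superposed between Pol gamma and T7 Pol."""
    label: str
    polg_start: int
    polg_end: int
    t7_start: int
    t7_end: int


# Residue ranges exactly as quoted in the text; labels are illustrative.
SEGMENTS = [
    Segment("palm sheet motif (G1127-E1144)", 1127, 1144, 646, 663),
    Segment("Q-helix (M1093-E1122)", 1093, 1122, 606, 635),
    Segment("palm hairpin (T846-V855)", 846, 855, 422, 431),
]


def polg_to_t7(position: int) -> Optional[int]:
    """Return the T7 Pol residue number paired with a Pol gamma position.

    Returns None when the position lies outside the quoted segments.
    """
    for seg in SEGMENTS:
        # Assumption: equal-length segments pair one-to-one, with no gaps.
        assert seg.polg_end - seg.polg_start == seg.t7_end - seg.t7_start
        if seg.polg_start <= position <= seg.polg_end:
            return seg.t7_start + (position - seg.polg_start)
    return None


if __name__ == "__main__":
    for residue in (1136, 1102, 853, 851, 955):
        print(residue, "->", polg_to_t7(residue))
```

Under this assumption the offsets reproduce residue pairs named elsewhere in this report, for example E1136/E655, Q1102/Q615, R853/R429 and T851/T427. Positions outside the three segments, such as Y955, return `None` rather than a guessed equivalent.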



CLUSTERING OF ALPERS DISEASE MUTATIONS 
WITHIN THE CATALYTIC CORE OF POL γ

Recessive mutations are distributed along the length of
the POLG1 gene sequence (http://tools.niehs.nih.gov/polg/),
but they cluster within the tertiary structure
of Pol γA to distinct regions within the catalytic
core (Figure 1). Five functional modules, termed clusters
1-5, were assigned within Pol γA as follows: Cluster 1
(in green) locates within the pol domain and comprises
largely residues affecting DNA polymerase activity;
Cluster 2 (in yellow) represents residues lining the
upstream DNA-binding channel; Cluster 3 (in red) is
associated with a novel structural motif in the fingers
subdomain, which we propose confers partitioning
of the DNA substrate between the pol and exo
sites; Cluster 4 (in blue) lies on the periphery of the exo
domain and mediates interactions with the distal Pol γB,
stabilizing the Pol γ:DNA complex; and Cluster 5 (in
cyan) locates within the spacer domain and represents a
region that we propose is involved in replisome
interactions.

Figure 1. Alpers disease mutations cluster within functional modules in the catalytic subunit of Pol γ. Upper panel: schematic diagram of the
POLG1 gene showing the distribution of recessive Alpers disease mutations (Human DNA Polymerase gamma Database, http://tools.niehs.nih.gov/polg/).
AID and IP in the spacer domain refer to the accessory (subunit) interacting and intrinsic processivity subdomains, respectively, that are
discussed in the text; NTD refers to the N-terminal domain. Lower panel: tertiary structural representation of the apoenzyme form of Pol γ [PDB
code 3IKM, (7)] with modeled DNA, identifying the positions of five functional modules (shown in mesh) that are defined by clusters of amino acid
residues (shown as spheres) affected by Alpers disease mutations as follows: Cluster 1, green; Cluster 2, yellow; Cluster 3, red; Cluster 4, blue; Cluster
5, cyan. The domains of Pol γA are shown as surface representations, and in part as secondary structural elements (SSEs) that are colored as
depicted in A. The proximal and distal accessory subunits are shown as surface representations in light and dark gray, respectively. Primer-template
DNA was docked as described in the text and is displayed as orange ribbons.
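
As a rough illustration of how a cluster assignment might be looked up for a candidate variant, the Python sketch below maps residue positions to clusters and checks whether two variants fall in different clusters. The position lists are a partial selection drawn from the cluster sections of this report (Clusters 4 and 5 are left empty because no residues for them are listed here). The function names are hypothetical, and a position match says nothing by itself about pathogenicity or inheritance: A957S, for instance, is a dominant mutation at a position that also carries the recessive A957P.

```python
import re
from typing import Optional, Tuple

# Partial, illustrative residue lists taken from the cluster sections of this
# report. They are not a complete catalogue of Alpers mutations.
CLUSTER_RESIDUES = {
    1: {848, 849, 851, 852, 853, 879, 885, 886, 888, 895, 914, 930,
        947, 957, 966, 1134, 1136, 1143, 1184, 1191},
    2: {587, 589, 752, 767},
    3: {1047, 1073},
    4: set(),  # no residues listed in this excerpt
    5: set(),  # no residues listed in this excerpt
}

# One-letter amino acid, position, one-letter substitution (e.g. "A767D").
_VARIANT = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(variant: str) -> Tuple[str, int, str]:
    """Split a substitution such as 'A767D' into ('A', 767, 'D')."""
    match = _VARIANT.match(variant)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {variant!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def cluster_of(variant: str) -> Optional[int]:
    """Return the cluster whose listed positions include this variant."""
    _, position, _ = parse_variant(variant)
    for cluster, positions in CLUSTER_RESIDUES.items():
        if position in positions:
            return cluster
    return None


def spans_two_clusters(variant_a: str, variant_b: str) -> bool:
    """True when both variants map to listed clusters and the clusters differ."""
    a, b = cluster_of(variant_a), cluster_of(variant_b)
    return a is not None and b is not None and a != b


if __name__ == "__main__":
    print(cluster_of("A767D"))
    print(spans_two_clusters("P1073L", "R852C"))
```

With these lists, `cluster_of("A767D")` returns 2 and `spans_two_clusters("P1073L", "R852C")` returns `True`, since the two positions are listed under Clusters 3 and 1. A variant at an unlisted position returns `None`, which should be read as "not covered by this sketch" rather than as evidence that the variant is benign.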



ALPERS CLUSTER 1: RESIDUES AFFECTING 5'→3'
DNA POLYMERASE ACTIVITY

Catalysis of template-directed nucleotide polymerization 
in DNA polymerases proceeds via a phosphoryl transfer 
reaction assisted by two Mg2+ ions (27,28). Incorporation
of the correct dNMP is mediated by a nucleotide-induced 
conformational shift of the fingers subdomain between an
open and closed state; the closed state results in phos- 
phoryl transfer when correct Watson-Crick base pairs 
between the incoming nucleotide and a templating base 
can fit stably in the active site (29,30). Similarities in the 
overall mechanism of DNA polymerases suggest a 
conserved structure of their active sites and provide a 
basis for comparative structural analysis and extrapola- 
tion of site-directed mutagenesis data from one DNA 
polymerase to its close homologs. The amino acid 
residues that we attribute to Alpers Cluster 1 contribute 
to nucleotide polymerization per se, to shaping the overall 
architecture of the pol active site, or to positioning the 
primer template within the active site (Figure 2A). 

Catalytic residues 

The largest group of recessive Alpers mutations map to
the pol domain in POLG1 (Figure 1), and a number of the
affected amino acids are likely involved in catalysis by
Pol γ. Residues E1136 and K1191 surround the highly
conserved residues D890 and D1135 (Figure 2B), which
locate at the base of the pol active site and have been
shown to coordinate the two magnesium ions required
for catalysis (7). In vitro mutagenesis of residue E883 in
Klenow (E1136 in Pol γ) resulted in decreased pol rate,
kcat and affinity for dNTPs and DNA (31,32), which fits
with the recessive nature of this mutation and its effect on
mtDNA copy number in vivo (16,23). Comparative structural
analysis of Pol γ and its close homologs revealed that
E895 likely confers discrimination against ribonucleotides,
and it is involved in aligning the incoming dNTP within
the catalytic site, together with the catalytic residues Y955
and Y951 [Figure 2C (33)]. Interestingly, E895 also
occupies part of the dNTP-binding pocket, and whereas
E895G is a recessive mutation, Y955C is dominant.
In vitro mutagenesis of Pol γ showed that the Y955C
variant exhibited dramatically reduced catalytic activity,
but had the same affinity for DNA as the wild-type
enzyme (33). In a disease situation, these effects cannot
be compensated for by the wild-type allele and as a
result, Y955C causes a dominantly inherited adult-onset
mitochondrial myopathy (autosomal dominant progressive
external ophthalmoplegia) associated with mtDNA
deletions (34). E1136 is part of one of three conserved
sequence motifs in the palm subdomain of family A
DNA polymerases (the PolC motif, Figure 2B). Its counterparts
E655 in the T7 Pol (25) (Figure 2B) and E786 in
the Klentaq (26) structures are not in contact with primer
or incoming dNTP, but mutagenesis of Klenow showed
that this residue is critical for catalysis (31). The ternary
complex of T7 Pol reveals an electrostatic interaction
between E655 and H704, and suggests that the role of
E655 is to position H704 in close proximity to the phosphate
moiety of the 3' nt in the primer strand, and it was
shown to be essential for catalysis (25). K1191 of Pol γ
occupies the equivalent position of H704 in T7 Pol, suggesting
that it would interact with E1136 in a similar
manner upon binding of the primer template to facilitate
catalysis.



In contrast to the rigid pocket formed by the palm 
active site residues, the opposite half of the active site is 
highly flexible to allow a fast transition between open and 
closed conformations of the fingers subdomain. This con- 
formational asymmetry provides the structural basis for 
the high affinity of DNA polymerases for the primer- 
template junction: the template must be bound firmly 
and consistently each catalytic cycle to assure the fingers 
sufficient freedom to examine each incoming nucleotide. 
Several Alpers mutations have been reported in this region 
of the fingers, scattered mainly along the O-helix 
(Figure 2C). Biochemical studies on the A957S mutant 
showed mild defects in kcat and dNTP binding, but a
modest increase in DNA-binding affinity (33). A957 
locates in the loop between helices O and Ol, within one 
of the three hinge regions, the 'GAG' hinge (residues 
G956/A957/G958), which in related DNA polymerases is 
known to confer the flexibility necessary for closing and 
opening of helixes O and Ol (27,35). Glycine to alanine 
substitution in the GAG hinge in T7 Pol resulted in 
complete loss of catalytic activity, which can be attributed 
to an abolished or diminished ability to shift between the open
and closed states (36). Mutations affecting the hinge 
region likely introduce steric clashes that would limit the 
extent of conformational shifts. Interestingly, different 
amino acid replacements for A957 in Pol γ do not have
similar consequences: A957S causes dominantly inherited 
adult-onset myopathy with multiple mtDNA deletions, 
whereas an A957P substitution has been reported only 
as a recessive, compound heterozygous mutation in 
patients with Alpers syndrome and mtDNA depletion. 
A957S showed an increased affinity for DNA (33), a con- 
sequence that is consistent with a slower open-close rate, 
and could potentially cause pol stalling or dissociation. In 
contrast, the severe defect caused by A957P suggests that 
the presence of proline in the hinge abolishes the flexibility 
between helixes O and Ol, a defect that we predict will 
result in decreased affinity for both dNTP and DNA and 
may prohibit the formation of a productive ternary 
complex. The loop between helixes O and Ol has been 
shown to form the pre-insertion site in Geobacillus 
stearothermophilus Klenow to accommodate the ssDNA 
template and prevent its premature entry into the active 
site (35). However, the available library of DNA polymer- 
ase structures reveals variability in the pre-insertion site, 
making it difficult to assign a critical role. Residue K947, 
which is associated with Alpers syndrome, is located on 
the O-helix adjacent to R943. The corresponding residues 
in T7 Pol (K522 and R518, respectively) contact the tri- 
phosphate moiety of the incoming nucleotide in the closed 
ternary complex (PDB code 1T8E). These electrostatic 
interactions provide most of the driving force for 
O-helix movement, and mutations in these positions will 
weaken these interactions.



Architectural residues 

Optimal catalysis requires not only the residues interacting 
directly with dNTP and primer template, but also those 
that maintain the architecture of the pol active site. 
Figure 2. Alpers Cluster 1 mutations affect the 5'→3' DNA polymerase activity of Pol γ. Amino acid residues affected by Alpers Cluster 1 mutations
in POLG1 are shown as green spheres. Other Pol γ residues that are discussed in the text are shown in brown, and T7 residues are shown in red. Pol
domain SSEs are shown in pink according to the schematic shown in Figure 1; ptDNA is indicated by orange (template) and brown (primer) strands.
The incoming dNTP is shown in blue. Mg2+ ions are shown as small gray spheres. A, upper panel, overview of the positions of Alpers Cluster 1
mutations with dashed black lines indicating the regions described in the text and depicted in B-D; A, lower panel, overview of the positions of
Alpers Cluster 1 mutations relative to ptDNA and incoming dNTP; B-D, positioning of Pol γ residues in the apoenzyme form [PDB code 3IKM (7)]
and T7 Pol residues in its closed ternary complex [PDB code 1T8E (25)] relative to ptDNA and incoming dNTP, with the dashed arrow showing the
expected movements of the Pol γ residues upon formation of a closed complex: B, Alpers mutations surrounding the Mg2+-binding residues; C,
Alpers mutations affecting O-helix movement and dNTP binding, with the T7 Pol O-helix in its closed conformation superimposed in red; D, Alpers
mutations affecting the RR loop and the surrounding residues.



We suggest that the pathogenic role of the Alpers mutations
Q879H, T885S, L886P, G888S, E1143 and D1184N
can be attributed to modifications in the local β-sheet
architecture surrounding the catalytic residues in the
palm (Figure 2A, upper panel). Defects are predicted to
reduce pol rate, and it appears that the closer the mutated
architectural residue is to the catalytic site, the more
pronounced its effect. In support of this, in vitro mutagenesis
of Q879 and T885, located a distance of ~15 Å
from the catalytic residues, caused only a 2-fold decrease
in pol rate (20). In contrast, mutagenesis of E883
located at the beginning of the strand β13 within the
catalytic site in Klenow (E1136 in Pol γA) reduced kcat
26-fold (31).



Nucleic Acids Research, 2011, Vol. 39, No. 21 9077 



The Alpers mutation D930N (Figure 2C) was studied
in vivo in yeast, and led to complete loss of mtDNA (37).
The corresponding residue in T7 Pol (D504) interacts with
triphosphate-binding residue R518 (R943 in Pol γ) in the
closed conformation (PDB code 1T8E), and therefore contributes
indirectly to correct dNTP binding. Residues
T914 and L966 located in the fingers subdomain
(Figure 2C) are conserved only among Pol γs, and are
linked to Alpers syndrome. Our structural analysis
predicts that mutations in these residues will likely compromise
the stability and conformation, and/or the transition
rate between the open and closed states of the
fingers. In sum, we propose that the primary effect of mutations
surrounding the O-helix residues is altered affinity
for dNTPs, which decreases pol rate.

Residues conferring a high affinity of the pol site for the 
primer-template junction 

Docking of DNA onto the Pol γ structure shows that the
DNA-binding channel of the enzyme encompasses ~20 bp
of DNA duplex, whereas those in T7 Pol and Klentaq are
relatively short and interact with ~8 bp of DNA duplex
(24-26). In Pol γ, similar to other family A DNA polymerases,
the DNA-binding channel starts with residues
located in the β7-loop-β8 motif in the palm, which form
extensive contacts with the first four base pairs in the
ptDNA. In Pol γA, this fragment encompasses residues
842-856 and appears as a loop in the crystal structure
with the current resolution of 3.24 Å (7). Because the
β7-loop-β8 motif has two arginines in its tip, R852 and
R853, the former of which is conserved only among Pol γs,
we designated it as the 'RR loop' (Figure 2D). Several
recessive mutations affect the RR loop: G848S, T849H,
T851A, R852C and R853Q, as well as the surrounding
residues H1134R and H1110Y. In vitro mutagenesis of
G848, T851, R852 and R853 in human Pol γ (20), and
R668 in Klenow [(31,32), corresponding to R853 in Pol γ],
caused a dramatic decrease in catalytic activity and
DNA-binding affinity. Polesky et al. (32) highlighted the
complex effect of these mutations, noting that a specific
amino acid can bind a critical part of the substrate,
contributing to the overall affinity of the enzyme for
ptDNA, and to catalysis by positioning the substrate in
the active site. Comparative structural analysis of Pol γ
homologs solved in ternary complexes with bound DNA
prompts us to propose that the primary function of the
RR loop is in binding and alignment of ptDNA within the
DNA-binding channel. H1134 locates adjacent to the RR
loop in the tertiary structure (Figure 2D). The corresponding
histidines in Klenow (31,32), T7 Pol (25) and Klentaq
(26) were shown to be essential for DNA binding, and for
coordination of the first nucleotide in the primer strand.
Analysis of Pol γ suggests that Q1102 from the Q-helix
(Q615 in T7 Pol) and R853 (R429 in T7 Pol) from the RR
loop coordinate the first base pair in the ptDNA, and
therefore may be crucial in sensing a mispair. T851
(T427 in T7 Pol) likely participates indirectly by positioning
Q1102. T849 is predicted to coordinate the DNA
duplex by interaction with the phosphate moiety of the
third nucleotide from the 3'-end of the primer. In the
Pol γ apoenzyme structure, the side chains of the
residues from the RR loop are oriented differently than
the orthologous residues in the T7 Pol ternary complex,
which suggests that DNA and dNTP binding affect the
positions of the residues that coordinate them. For
example, R853 in the Pol γ apoenzyme interacts with the
metal-binding residue D1135 as was noted by Stumpf and
Copeland (17) and with E895, but in complex with DNA,
R853 would shift to interact with the base of the 3' nt of
the primer strand, as in T7 Pol (24-26). A distinguishing
feature of the RR loop, in comparison with the
orthologous β7-loop-β8 motif in other family A DNA
polymerase group members, is the electrostatic interaction
between R852 and D1107, which may confer additional
stability to the RR loop in Pol γ (Figure 2D). Disease
mutations in this domain affect primarily DNA binding
and positioning in the active site channel, manifesting
defects in both DNA-binding affinity and in catalytic
efficiency.



CLUSTER 2: RECESSIVE MUTATIONS AFFECTING 
THE UPSTREAM DNA-BINDING CHANNEL 

Processivity of a DNA polymerase depends both on
DNA-binding affinity and on the rate of DNA polymerization.
Structural analysis shows that the mutations affecting
interaction of Pol γ with the upstream DNA
duplex are located within the spacer domain (Figure 1).
This would argue that the AID [accessory (subunit)-interacting
domain] and IP (intrinsic processivity)
subdomains adopt a different conformation in Pol γ:DNA
complexes. To evaluate the potential impact of recessive
spacer region mutations on DNA binding, we considered
results of in vitro mutagenesis of the spacer domain in
Drosophila Pol γ (38) and in Saccharomyces cerevisiae
Mip1 (22) (Figure 3). Analysis of the Pol γ structure
with modeled ptDNA reveals that residues K755 from
the IP subdomain and Q497 in the K-tract of the AID
subdomain interact with upstream DNA. In the human
Pol γ:DNA model, the L752P and A767D substitutions
would affect the structure in the tip of the IP subdomain,
which contacts the minor groove of the DNA duplex in
a similar way as reported for T7 Pol [residues T354-V363
(25)]. In the Pol γ:DNA complex, L752 and A767
neighbor the K768/D769/F770 triplet from fly Pol γ;
triple alanine substitution of KDF in fly Pol γ resulted
in a 1.4-fold decrease in DNA-binding affinity (38). We
predict that the Alpers A767D mutation likely causes
similar consequences for DNA binding. Mutations of
P587 and P589 in the spacer domain are also linked to
Alpers disease. These residues constrain the β-hairpin
loop between the IP and AID subdomains and are positioned
close to the Y452/E453/E454 triplet that lies within
the thumb [corresponding to the YED triplet in
Drosophila Pol γ (38)]. We postulate that the P587L and
P589L mutations may have the same consequences as
the triple alanine substitution of YED in the fly enzyme,
which caused reduced processivity, most likely as a result
of misalignment of the ptDNA with respect to the pol
catalytic site (38). Furthermore, a triple alanine mutant
of P556A/K557A/L558A in the fly was shown to be completely
deficient in DNA binding and pol activity (38); this
defect is likely due to disruption of the hydrophobic structure
of the spacer domain, which shapes the DNA-binding
channel wall. Notably, the residue affected in the common
A467T human disease allele is also part of this PKL
hydrophobic center, and its biochemical properties most
likely derive from similar structural alterations that interrupt
hydrophobicity with the introduction of a hydroxyl
group. Whereas we and others have found that the A467T
catalytic core alone exhibits highly reduced DNA-binding
affinity and pol activity (19,21), we showed that these
defects are mitigated partially by its association with the
accessory subunit (21). This likely results from partial stabilization
of the overall conformation of the mutant catalytic
core within the reconstituted Pol γ holoenzyme.

Figure 3. Alpers Cluster 2 mutations affect the upstream DNA-binding
channel of Pol γ. Amino acid residues affected by Alpers Cluster 2
mutations in POLG1 are shown as yellow spheres. Other Pol γ
residues that are discussed in the text are shown in brown. Spacer
domain SSEs are shown in magenta and pol domain SSEs are in
pink according to the schematic shown in Figure 1; ptDNA is indicated
by orange (template) and brown (primer) strands. The spacer domain is
also shown as a transparent surface representation in pale gray and the
exo domain is shown in purple.



CLUSTER 3: MUTATIONS ASSOCIATED WITH A 
NOVEL, POL γ-SPECIFIC FUNCTIONAL MODULE,
CONFERRING PARTITIONING OF DNA 
SUBSTRATE BETWEEN THE POL AND 
EXO ACTIVE SITES 

Comparative analysis of the pol domains from human Pol γ,
T7 Pol and T7 RNA Pol reveals a remarkable structural
similarity in their overall folds. In fact, the only significant
difference between the analogous pol domains lies in the
region that connects the P-helix of the fingers subdomain
and Q-helix of the palm (Supplementary Figure S2), which
contains a novel module (Figure 4) whose amino acid
sequence is highly conserved in Pol γs from yeast to
man. This, and the finding that several recessive Alpers
mutations cluster in this region (Figure 1), argue that it
serves an important role. Indeed, in our Pol γ:DNA
model, this module extends into the DNA-binding
channel in a position very near the pol active site (<10 Å
from the Mg2+-binding catalytic residues). In bacteriophage
N4 RNA polymerase (PDB code 2P04) (39) and
in T7 RNA polymerase [PDB code 1ARO (40);
Supplementary Figure S2], the fragment connecting the
fingers with the helix corresponding to the Q-helix in Pol γ
participates in specific recognition of the transcriptional
promoter; it is aptly termed a specificity loop (39-41). In
Pol γ, we have adopted the term 'partitioning loop' based
upon the functional role we propose it serves. We discuss
our rationale below, and use it to establish structure-function
justifications of disease manifestation for the
Alpers mutations we assign to Cluster 3.

Figure 4. Alpers Cluster 3 mutations are associated with a novel Pol
γ-specific functional module proposed to be involved in primer strand
partitioning between the pol and exo sites. Amino acid residues affected
by Alpers Cluster 3 mutations in POLG1 are shown as red spheres/
mesh adjacent to a novel alpha helix with an associated loop-hairpin
(the partitioning loop), also shown in red. The brown spheres and mesh
represent the SYW (fly)/SFW (human) and surrounding residues, respectively,
that are described in the text. The pol domain is shown as a
surface representation in pink and the exo domain is shown in purple,
according to the schematic shown in Figure 1; ptDNA is indicated by
orange (template) and brown (primer) strands. (A) The predicted
position of the partitioning loop relative to ptDNA in the pol mode,
and (B) represents the exo mode. To dock the frayed ptDNA in the exo
active site, the exo domain residues 324-518 of Klenow (PDB code
1KLN) were aligned with the exo domain residues 170-440 of Pol γ
(PDB code 3IKM) (Supplementary Figure S3).

We propose that the partitioning loop modulates the
partitioning of the primer strand between the pol and
exo active sites by forming stable contacts with correctly
base-paired ptDNA, and destabilizing ptDNA that
contains mispairs or lesions. The partitioning loop comprises
residues 1050-1095, and the atomic structure reveals that its
first 10 residues adopt an α-helical fold
that extends from the fingers subdomain directly into the
DNA-binding channel. The beginning of this helix
contains a W1049XGG1052 motif, which is strictly
conserved in Pol γ, and a similar motif is also present in
T7 Pol (W579XAG582). Our Pol γ:DNA model shows that
the WXXG motif maps to the same region of the fingers
domain in both Pols. In T7 Pol, it is responsible for
binding downstream template ssDNA via base-stacking
interactions between W579 and the nucleotide base
[PDB code 1T8E (25)], and likely performs an equivalent
function in Pol γ. The novel functionality that the partitioning
loop contributes to Pol γ is achieved by its unusual
loop-hairpin component spanning residues 1060-1074
that appears to grip the modeled ptDNA along the
major groove as shown in Figure 4A. We propose that it
uses a steric exclusion mechanism that confers exquisite
specificity by associating closely with the major groove of
primer template, such that correctly base-paired substrates
are bound stably. In contrast, DNA lesions and mispairs
that adopt an altered helical structure clash sterically with
the partitioning loop, resulting in exclusion of the primer
template from the pol active site. Excluding primer
template may facilitate fraying of the duplex, generating
ssDNA primer ends that can be bound by and hydrolyzed
in the exo active site (Figure 4B).

Mutations in the residues composing the partitioning 
loop, such as the Alpers mutations R1047W and 
P1073L, as well as the progressive external ophthalmople- 
gia mutations G1051R and G1076V, have been studied 
in vivo. Yeast strains homozygous for G1051R Pol y exhibited
a point mutation frequency >10-fold higher than
the wild-type strain, and heterozygous strains showed 
frequencies >2-fold higher relative to homozygous 
wild-type strains (42). Increased point mutation 
frequencies were also reported for yeast strains heterozy- 
gous and homozygous for the Alpers mutation P1073L 
and the G1076V mutation associated with progressive
external ophthalmoplegia (22,43). Both the P1073L and
G1076V mutations cause substitutions in strictly
conserved residues located in the loop region that is 
expected to contact DNA, and the G1051R mutation 
affects a strictly conserved glycine in the WXXG motif. 
Interestingly, yeast strains heterozygous and homozygous 
for L304R, S305R, Q308H, R309L and R309H showed 
similar increases in point mutation frequency as well as 
mtDNA depletion (43). These exo domain residues are 
located in a loop-helix motif adjacent to the partitioning 
loop, and have been described previously as the orienter 
module (22) (Figure 4). When S305R and P1073L were
present in the compound heterozygous state, the point mutation
frequency increased drastically to >70-fold that of the
wild-type strain (44). Non-complementation of these two
mutations suggests that both are involved in the same
function, and we propose that the role of the orienter
module is to position the partitioning loop correctly.
L304R, R309H, R309L and W312R were analyzed biochemically
by producing and purifying homologous yeast
Pol y variants (22). All variants exhibited reduced
DNA-binding affinity and reduced pol activity and in 
addition, the L304R variant showed a significant 
increase in exo activity (22). An identical biochemical 
phenotype was reported previously for the fly Pol y SYW
triple alanine variant; these residues correspond to thumb residues
S799/F800/W801 in human Pol y (38). As illustrated in
Figure 4, the SYW residues are located on the face of 
the thumb subdomain directly across the DNA-binding 
channel from the orienter module, and we predict that 
this region of the thumb plays an equivalent role to the 
orienter module. Therefore, we define Cluster 3 as muta- 
tions affecting residues of the partitioning loop, as well as 
nearby residues that govern the position and conform- 
ation of the partitioning loop. Mutations in Cluster 3 
will alter the balance between pol and exo activity and 
diminish the fidelity of the polymerase. We note the pos- 
sibility that enhanced fidelity may also be experimentally 
feasible, introducing the potential to engineer a fine-tuned 
anti-mutator replicase. 

In sum, our justification for the proposed role of the 
partitioning loop is based on its sequence conservation 
among Pol ys from yeast to man, its positioning relative
to the specificity loop found in T7 RNA Pol, the clustering 
of several recessive Alpers mutations in this region, its 
absence in any other Pol family and the documented 
high fidelity of Pol y. We argue that an alternate hypoth- 
esis that this structural element would simply rotate away 
upon DNA binding (7) seems unlikely based on the above 
considerations and, in particular, on the evolutionary con- 
servation of a novel structural module that is positioned 
strategically in the DNA-binding channel very near the 
pol active site. Clearly, validation of either hypothesis 
warrants future experimentation. 



CLUSTER 4: MUTATIONS AFFECTING POL yA 
INTERACTIONS WITH THE DISTAL POL yB UPON 
DNA BINDING BY POL y HOLOENZYME 

Pol yA contains independent 5'→3' DNA polymerase and
3'→5' exonuclease active sites whose functions are
coordinated in proofreading DNA synthesis. The human 
Pol y crystal structure showed that its homodimeric Pol yB 
interacts asymmetrically with Pol yA, such that one Pol yB 
protomer forms the dominant subunit interface and is 
designated as the proximal accessory subunit, while the 
other protomer makes very limited contact with Pol yA 
and is designated as the distal accessory subunit (7). In
fact, in the apoenzyme structure, the Pol yA interaction
with the distal Pol yB is mediated through a single ionic bond
between Pol yA R232 and Pol yB E394 [(7) and Figure 5].
A subsequent biochemical study on the interaction 
between the catalytic and accessory subunits showed 
that the proximal Pol yB increases DNA-binding affinity 
of the holoenzyme, while the distal protomer enhances the 
polymerization rate of the holoenzyme (45). Again, as suggested
in the section on Cluster 2, we posit that the holoenzyme
likely undergoes substantial conformational
rearrangements in the IP and AID subdomains of the 
spacer upon formation of an active complex with DNA. 

Figure 5. Alpers Cluster 4 mutations affect Pol yA interactions with
the distal Pol yB upon DNA binding by Pol y holoenzyme. Amino acid
residues affected by Alpers Cluster 4 mutations in POLG1 are shown as
blue spheres/mesh and are located largely within the exo domain
(shown in purple). The helix shown in magenta is the AID (accessory 
interacting domain) of the spacer region that interacts with the 
proximal accessory subunit. Other domains are colored according to 
the schematic shown in Figure 1; duplex DNA is indicated by the 
orange strand. Orange spheres indicate the positions of the accessory 
subunit residues described in the text. The proximal and distal acces- 
sory subunits are shown as surface representations in light and dark 
gray, respectively. 



Mapping of Alpers mutations within the crystal struc- 
ture revealed that the majority of the mutations in the exo 
domain cluster on the protein surface within 13-17 Å from
Pol yA residue R232 (Cluster 4 shown in blue in Figures 1 
and 5). The functional role of this region can be deduced 
from biochemical studies on mutations of the neighboring 
R232 (46) and L244 residues (22,43). Mutations of R232 
were shown to decrease pol activity, DNA-binding affinity 
and processivity of the holoenzyme yet at the same time, 
to enhance its exonuclease activity, which was also 
rendered less selective for mismatches (46). These data 
argue that in the Pol y:DNA complex, and/or in its 
ternary complex with dNTP, the distal accessory subunit 
associates more tightly with the catalytic subunit, thereby 
enhancing binding of the upstream DNA in the pol mode, 
and limiting the rate of translocation of the frayed 3'-end 
of the primer to the exo site (46). Accordingly, an 
orthologous variant of the L244P human mutation in 
yeast caused increased mutation frequency (22,43). 
Previous site-directed mutagenesis of human Pol yB also 
suggested extensive interaction between the catalytic and 
distal accessory subunits within the ternary complex (5). 
Analysis of the Pol y apoenzyme structure shows that 
E445 and T447 of Pol yB are located at the tip of its
C-terminal region, and that only in the distal subunit are 
these residues oriented toward Cluster 4 on the edge of exo 
domain. A double mutation of these residues (E445A/ 
T447A) led to a dramatic decrease of the stimulatory 
effect of Pol yB on the wild-type Pol yA, and a decrease 
in DNA binding (5). We propose that residues in Cluster 4
of Pol yA contribute to tight interaction between the exo
domain of the catalytic subunit and the distal accessory
subunit, resulting in more extensive enclosure of the
primer template in the pol mode. We also predict the
participation of the AID of the spacer domain in these
conformational changes, because it is associated intimately
with the accessory subunit. In support of this, biochemical
studies performed on the yeast Pol y core variant of Alpers
mutation R574, which is located in the AID subdomain,
showed not only reduced DNA-binding affinity, but also
reduced exo activity and a severe processivity defect
(22,43). To some extent, these interactions would
provide structural constraints on the upstream DNA in
the direction toward the partitioning loop. Therefore,
the partitioning loop, orienter module and thumb in Pol
yA, and the interface between its exo domain and the
distal accessory subunit might function in a concerted
manner to affect switching between the pol and exo
modes in Pol y function.

Figure 6. Alpers Cluster 5 mutations are proposed to affect replisome
interactions. Amino acid residues affected by Alpers Cluster 5 mutations
in POLG1 are shown as cyan spheres. Spacer domain SSEs within
the IP (intrinsic processivity subdomain) are shown in magenta;
ptDNA is indicated by orange (template) and brown (primer)
strands. The spacer domain is also shown as a transparent surface
representation in pale gray and the exo domain is shown in purple.
Brown spheres indicate the positions of the IP residues described in the
text.
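
The proximity criterion used above to delineate Cluster 4 (surface residues lying 13-17 Å from Pol yA R232) amounts to a distance-shell filter around a reference residue. A minimal Python sketch follows, assuming one representative coordinate per residue (for example, the Cα atom). The function name `residues_in_shell`, the residue labels and all coordinates are placeholders invented for illustration; they are not values taken from the Pol y crystal structure.

```python
import math


def residues_in_shell(coords, reference, r_min=13.0, r_max=17.0):
    """Return labels of residues whose distance from `reference`
    lies between r_min and r_max (inclusive, in Angstroms).

    `coords` maps a residue label to an (x, y, z) tuple. Which atom
    represents each residue is an assumption left to the caller.
    """
    ref_xyz = coords[reference]
    hits = []
    for label, xyz in coords.items():
        if label == reference:
            continue
        distance = math.dist(ref_xyz, xyz)  # Euclidean distance
        if r_min <= distance <= r_max:
            hits.append(label)
    return sorted(hits)


# Placeholder coordinates, NOT taken from any structure file.
toy_coords = {
    "ref": (0.0, 0.0, 0.0),
    "resA": (15.0, 0.0, 0.0),   # 15 A from ref: inside the shell
    "resB": (3.0, 4.0, 0.0),    # 5 A from ref: too close
    "resC": (0.0, 12.0, 9.0),   # 15 A from ref: inside the shell
    "resD": (20.0, 0.0, 0.0),   # 20 A from ref: too far
}

print(residues_in_shell(toy_coords, "ref"))
```

In practice the coordinates would be read from the deposited structure, and a surface-exposure filter would be needed as well, since the criterion in the text applies to residues on the protein surface.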



CLUSTER 5: MUTATIONS AFFECTING A REGION 
OF THE IP SUBDOMAIN THAT IS LIKELY 
INVOLVED IN REPLISOME CONTACTS 

Many reported Alpers mutations (L623W, R627W/Q, 
P648R, G737R, G746S, W748S and F749S) map to the 
distal surface of the IP subdomain (Figure 6). We con- 
sidered these as distinct from the Cluster 2 residues 
because they are located much further from the 
DNA-binding channel. In addition, in vitro studies of 
W748S and R627W/Q variants showed no defects in pol 
activity, processivity or DNA-binding affinity (47,48). In 
contrast, biochemical defects were demonstrated earlier in 
single alanine variants of fly Pol y within this region of the
spacer: a W576A variant was nearly inactive, F578A 
retained half of wild-type activity, G575A displayed 
wild-type activity and all three variants showed substan- 
tially reduced stimulation by mtSSB (38). The fly G575/ 
W576/F578 residues correspond to residues G619/W620/ 
Y622 in human Pol y. In the human Pol y apoenzyme 
structure, W620 is buried in the hydrophobic core of the 
IP subdomain, whereas Y622 and G619 are located closer 
to the surface (7). The combined biochemical data suggest 
that residues closer to the distal surface of the IP domain 
are not critical for catalytic function. Cluster 5 mutations 
map to this surface, and therefore a catalytic defect per se 
is not expected to be the cause of Alpers manifestation. 
Rather, we propose that this region is involved in protein- 
protein contacts, a likely partner being mtSSB as sug- 
gested by the GWF fly Pol y variants (38). Another can- 
didate may be the mitochondrial replicative DNA helicase 
Twinkle, but such an interaction has not been investigated 
to date. 

PROSPECTS 

Toward a diagnostic tool to assess new human 
mitochondrial disease mutations 

We show that although the reported Alpers mutations 
scatter across the entire POLG1 gene, they cluster to 
distinct functional regions in the Pol y tertiary structure. 
Table 1 summarizes the structure-function predictions for 
each cluster and also displays the cluster combinations 
that have been reported in compound heterozygous 
patients with Alpers syndrome. These combinations do 
not occur randomly; rather, multiple occurrences of
specific cluster combinations are evident whereas others
are absent, suggesting that the latter combinations are
not tolerated. Cluster 1 mutations locate in the pol
active site region and will invariably cause reduced pol 
activity, potentially combined with reduced DNA 
binding. Cluster 2 mutations locate in the upstream 
DNA-binding channel, resulting in reduced DNA- 
binding affinity, but are too distant from the catalytic 
site to directly affect pol activity to the extent of Cluster
1 mutations. Therefore, Cluster 2 mutations will be reces- 
sive, whereas Cluster 1 mutations may be either dominant 
or recessive, depending on the severity of the biochemical 
defect. Cluster 3 mutations locate to the partitioning loop 
or its environs, such as the orienter module, and we 
predict that they will alter the balance of polymerase 
and exonuclease activities of mutant Pol y. Cluster 4 mutations
locate on the surface of the exo domain along the
interface with the distal accessory subunit. These mutations
are predicted to reduce the stimulation conferred by the
distal accessory subunit, which has been documented to
enhance pol rate and to reduce exo activity (45). Both
Cluster 3 and Cluster 4 mutations will likely cause increased
mutagenesis in vivo, a phenotype that has been observed in
yeast models to be associated with amino acid alterations 
within these clusters (43). Cluster 5 mutations locate on 
the distal surface of the intrinsic processivity (IP) 
subdomain, removed from the pol active site, 
DNA-binding channel and accessory subunit interface. 
Their distant location and lack of biochemical defects 
prompt us to speculate that this region is involved in 
replisome contacts. This hypothesis, in particular, high- 
lights the need for future experimentation on both the 
physical and functional interactions among the key 
proteins at the mtDNA replication fork. 

Table 1. Structural and functional features of the five proposed Alpers Clusters

Cluster | Structural location | Predicted primary biochemical defect | Predicted phenotype (primary, secondary) | Causes Alpers when found in trans with Cluster
1 | Pol active site and environs | Pol activity | Reduced pol rate; reduced DNA-binding affinity; reduced processivity | 2, 3 or 5
2 | Upstream DNA-binding channel | DNA-binding affinity | Reduced DNA-binding affinity; reduced pol rate; reduced processivity | 1, 3, 4 or 5
3 | Partitioning loop | Partitioning of primer strand between pol and exo active sites | Altered exo/pol ratio; altered substrate specificity; altered error rate; altered exo rate; reduced DNA-binding affinity; reduced processivity | 1, 2 or 5
4 | Exo-IP interaction (stabilized by distal Pol yB) | Stabilization of ternary complex | Reduced pol rate enhancement by Pol yB; decreased exo specificity; increased error rate; reduced processivity | 2 or 5
5 | Periphery of IP domain | Protein-protein interactions | Reduced SSB stimulation; other functions not studied to date | 1, 2, 3 or 4

Figure 7. Combinations of mutations found in Alpers patients. Individual Alpers mutations are grouped by cluster as shown in Figure 1, and black
blocks represent Alpers manifesting combinations. The inset in the lower right presents a simplified version of the table by reducing the axes to the
five clusters only, where gray blocks represent cluster combinations that have not been found in Alpers patients. The tabulated data suggest that two
mutations from the same cluster do not typically manifest as early-onset Alpers disease. Furthermore, the data indicate that Cluster 4 (blue)
mutations manifest as Alpers disease only when in combination with Cluster 2 (yellow) or Cluster 5 (cyan) mutations. These trends support the
existence of unique functional relationships in Pol yA that are inherent to each cluster.

POLG1 shows significant polymorphic variation in the
human population, and a constant challenge in DNA
diagnosis is to distinguish pathogenic mutations from
neutral variants. We suggest that the remarkable clustering
of Alpers mutations into specific functional regions
(Figure 1) enables the use of our Pol y:DNA model as a 
diagnosis-supporting tool for evaluating pathogenic po- 
tential of new sequence variants. For example, Q1102 is 
located within the Cluster 1 region, and mutations affect- 
ing this amino acid can be predicted to cause catalytic 
defects. This prediction is consistent with biochemical 
studies on the Q1102 counterpart in bacteriophage T7
Pol (Q615), which was shown to be essential for the 
fidelity of nucleotide incorporation and for ptDNA 
binding (25). Functional predictions can be made in the 
absence of biochemical data as well; for example, the tam9
mutant allele, carrying an E595V mutation in the catalytic
subunit of Pol y, causes mtDNA depletion in the fly
(49,50). E595V corresponds to E641V in human Pol y,
which maps to Cluster 5 (Figure 6). Our model predicts
that the tam9-associated mtDNA depletion is caused by
deficient replisome interactions. Consequently, we would
predict that mutations affecting Q1102 and E641 are both potential candidates
for causing recessive POLG syndrome with mtDNA 
depletion. 

Predicting the consequences of compound heterozygosity 
via Alpers cluster analysis 

Alpers syndrome is typically caused by compound hetero- 
zygosity of two mutations occurring along the entire 
length of the POLG1 gene. In contrast, homozygosity 
for a single mutation often is associated with a 
somewhat milder disorder (2,51). Utilizing the Human 
DNA Polymerase y Database, we compiled data to show
all combinations of mutations that have been reported to 
trigger Alpers disease (Figure 7). Alpers patients typically 
do not show a combination of two mutations from the 
same cluster. In addition, Cluster 4 mutations manifest
as Alpers disease only when combined with Cluster 2
or Cluster 5 mutations. These trends support our
structure-guided assignment of the mutations into the
five proposed clusters, and suggest unique functional 
roles inherent to each. With the rapid development of 
massively parallel sequencing methodologies, the number
of new Pol yA variants is likely to increase significantly, 
emphasizing the importance of bioinformatic tools to 
evaluate the pathogenic role of identified variants. 
Furthermore, the identification of synergistic functional 
relationships between the characterized clusters provides 
novel insight into the mechanism of Alpers disease mani- 
festation, and will serve as a framework for future struc- 
ture-function studies of the mitochondrial DNA replicase. 
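
The cluster-combination trends summarized in Table 1 and Figure 7 can be expressed as a small lookup, which may serve as a starting point for the kind of bioinformatic screening mentioned above. The Python sketch below is illustrative only: the names `REPORTED_IN_TRANS`, `combination_reported` and `unreported_pairs` are hypothetical, and the pairings are transcribed from the last column of Table 1. A pairing that is absent here has simply not been reported in Alpers patients; the sketch makes no claim that it is benign.

```python
# Cluster pairings reported in trans in Alpers patients, transcribed from
# the last column of Table 1. All names are illustrative assumptions.
REPORTED_IN_TRANS = {
    1: {2, 3, 5},
    2: {1, 3, 4, 5},
    3: {1, 2, 5},
    4: {2, 5},
    5: {1, 2, 3, 4},
}


def combination_reported(cluster_a, cluster_b):
    """Return True if the two clusters have been reported together in trans."""
    for cluster in (cluster_a, cluster_b):
        if cluster not in REPORTED_IN_TRANS:
            raise ValueError(f"unknown cluster: {cluster}")
    return cluster_b in REPORTED_IN_TRANS[cluster_a]


def unreported_pairs():
    """List unordered cluster pairs, same-cluster pairs included,
    that have not been reported in Alpers patients."""
    clusters = sorted(REPORTED_IN_TRANS)
    return [
        (a, b)
        for i, a in enumerate(clusters)
        for b in clusters[i:]
        if not combination_reported(a, b)
    ]


# Consistency check: the relation should be symmetric (A with B implies B with A).
assert all(
    a in REPORTED_IN_TRANS[b]
    for a, partners in REPORTED_IN_TRANS.items()
    for b in partners
)

print(combination_reported(4, 2))  # Cluster 4 with Cluster 2
print(unreported_pairs())
```

Storing the relation in both directions keeps lookups simple, and the symmetry assertion guards against transcription errors if the table is updated as new patient genotypes are reported.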

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online. 